Overview

Dataset Statistics

Number of Variables 4
Number of Rows 16000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.7 MB
Average Row Size in Memory 113.3 B
Variable Types
  • Numerical: 3
  • Categorical: 1
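
The two memory figures above are linked by simple arithmetic. The sketch below multiplies the average row size by the row count. It assumes that "MB" in this report means 2**20 bytes; the report itself does not state the unit base, so treat that as a guess that happens to reproduce the rounded total.

```python
# Values copied from the Dataset Statistics above.
n_rows = 16000
avg_row_bytes = 113.3  # "Average Row Size in Memory"

# Assumption: "MB" is 2**20 bytes (the report does not say which base it uses).
total_bytes = n_rows * avg_row_bytes
total_mb = total_bytes / 2**20

print(f"{total_bytes:.0f} B is about {total_mb:.1f} MB")
```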

Dataset Insights

item_id is skewed
rating is skewed
name has a high cardinality: 3304 distinct values
rating has 3058 (19.11%) zeros
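
The percentages quoted in these insights are plain ratios against the row count. A minimal sketch, using only numbers that appear in this report:

```python
# Values copied from the report.
n_rows = 16000
rating_zeros = 3058    # zeros in the rating column
name_distinct = 3304   # distinct values in the name column

zero_pct = rating_zeros / n_rows * 100       # share of zero ratings
distinct_pct = name_distinct / n_rows * 100  # share of distinct names

print(f"rating zeros: {zero_pct:.4f}%")
print(f"name distinct: {distinct_pct:.4f}%")
```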

Variables


user_id

numerical

Approximate Distinct Count 12344
Approximate Unique (%) 77.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000
Mean 36773.0579
Minimum 5
Maximum 73515
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • user_id is very slightly skewed left (γ1 = -0.0099), i.e. close to symmetric

Quantile Statistics

Minimum 5
5-th Percentile 4014.8
Q1 19092.75
Median 37007
Q3 54683
95-th Percentile 69263.15
Maximum 73515
Range 73510
IQR 35590.25

Descriptive Statistics

Mean 36773.0579
Standard Deviation 20964.4722
Variance 4.3951e+08
Sum 5.8837e+08
Skewness -0.009872
Kurtosis -1.1957
Coefficient of Variation 0.5701

item_id

numerical

Approximate Distinct Count 3304
Approximate Unique (%) 20.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000
Mean 8863.7283
Minimum 1
Maximum 34240
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • item_id is skewed right (γ1 = 1.0053)

Quantile Statistics

Minimum 1
5-th Percentile 121
Q1 1292
Median 6156.5
Q3 13939
95-th Percentile 28223
Maximum 34240
Range 34239
IQR 12647

Descriptive Statistics

Mean 8863.7283
Standard Deviation 8835.4025
Variance 7.8064e+07
Sum 1.4182e+08
Skewness 1.0053
Kurtosis 0.03114
Coefficient of Variation 0.9968
  • item_id is not normally distributed (p-value 3.736e-18)
  • item_id has 38 outliers
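
The report does not say how the 38 outliers were identified. One common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles. The sketch below computes those fences from the item_id quartiles above, purely as an illustration under that assumption.

```python
# Values copied from the item_id quantile statistics above.
minimum, maximum = 1, 34240
q1, q3 = 1292, 13939

iqr = q3 - q1  # matches the reported IQR of 12647

# Assumption: Tukey's 1.5 * IQR rule; the report's actual outlier rule is not stated.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"fences: [{lower_fence}, {upper_fence}]")
print("values below lower fence possible:", minimum < lower_fence)
print("values above upper fence possible:", maximum > upper_fence)
```

Under this rule, any flagged item_id values would lie above the upper fence, since the minimum of 1 is well above the lower fence.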

rating

numerical

Approximate Distinct Count 11
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000
Mean 6.319
Minimum 0
Maximum 10
Zeros 3058
Zeros (%) 19.1%
Negatives 0
Negatives (%) 0.0%
  • rating is skewed left (γ1 = -1.0103)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 5
Median 7
Q3 9
95-th Percentile 10
Maximum 10
Range 10
IQR 4

Descriptive Statistics

Mean 6.319
Standard Deviation 3.3807
Variance 11.4295
Sum 101104
Skewness -1.0103
Kurtosis -0.3883
Coefficient of Variation 0.535
  • rating is not normally distributed (p-value 3.150e-10)
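
The γ1 values quoted for each variable are skewness coefficients: negative means a longer left tail, positive a longer right tail. A minimal sketch of the moment-based definitions follows. It assumes population (biased) moments, and the report may apply a different estimator or correction. The sample lists are hypothetical and are not taken from the dataset.

```python
def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)

def skewness(values):
    """Moment coefficient of skewness (gamma_1)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]      # hypothetical, symmetric around 3
left_skewed = [0, 7, 8, 9, 10]   # hypothetical, one low value pulls the tail left

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(left_skewed))
```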

name

categorical

Approximate Distinct Count 3304
Approximate Unique (%) 20.6%
Missing 0
Missing (%) 0.0%
Memory Size 1443872

Length

Mean 22.8207
Standard Deviation 13.9499
Median 19
Minimum 1
Maximum 98

Sample

1st row Naruto
2nd row Naruto
3rd row Naruto
4th row Naruto
5th row Naruto

Letter

Count 306776
Lowercase Letter 255403
Space Separator 43187
Uppercase Letter 51373
Dash Punctuation 1801
Decimal Number 3072
  • name contains many words: 4538 words
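
The labels in the Letter table match Unicode general category names (Lowercase Letter = Ll, Uppercase Letter = Lu, Space Separator = Zs, Dash Punctuation = Pd, Decimal Number = Nd), and the reported Count equals the lowercase and uppercase totals added together. The sketch below shows one way such counts could be produced with Python's standard library. It is an illustration, not the report's own implementation, and the third sample string is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(names):
    """Count characters across all names by Unicode general category."""
    counts = Counter()
    for name in names:
        for ch in name:
            counts[unicodedata.category(ch)] += 1
    return counts

# "Naruto" appears in the report's sample rows; the last entry is hypothetical.
sample = ["Naruto", "Naruto", "One-Punch Man 2"]
counts = category_counts(sample)

letters = counts["Ll"] + counts["Lu"]  # analogous to the table's "Count" row
print(dict(counts), "letters:", letters)
```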

Interactions

Correlations

Missing Values